Differentiating Individuals with and without Alcohol Use Disorder Using Resting-State fMRI Functional Connectivity of Reward Network, Neuropsychological Performance, and Impulsivity Measures

Individuals with alcohol use disorder (AUD) may manifest an array of neural and behavioral abnormalities, including altered brain networks, impaired neurocognitive functioning, and heightened impulsivity. Using multidomain measures, the current study aimed to identify specific features that can differentiate individuals with AUD from healthy controls (CTL), utilizing a random forests (RF) classification model. Features included fMRI-based resting-state functional connectivity (rsFC) across the reward network, neuropsychological task performance, and behavioral impulsivity scores, collected from thirty abstinent adult males with prior history of AUD and thirty CTL individuals without a history of AUD. It was found that the RF model achieved a classification accuracy of 86.67% (AUC = 93%) and identified key features of FC and impulsivity that significantly contributed to classifying AUD from CTL individuals. Impulsivity scores were the topmost predictors, followed by twelve rsFC features involving seventeen key reward regions in the brain, such as the ventral tegmental area, nucleus accumbens, anterior insula, anterior cingulate cortex, and other cortical and subcortical structures. Individuals with AUD manifested significant differences in impulsivity and alterations in functional connectivity relative to controls. Specifically, AUD showed heightened impulsivity and hypoconnectivity in nine connections across 13 regions and hyperconnectivity in three connections involving six regions. Relative to controls, visuo-spatial short-term working memory was also found to be impaired in AUD. In conclusion, specific multidomain features of brain connectivity, impulsivity, and neuropsychological performance can be used in a machine learning framework to effectively classify AUD individuals from healthy controls.


Introduction
The drugs of abuse, including alcohol, exert and maintain their reinforcing effects through the reward circuitry of the brain [1,2]. Neuroimaging studies have documented the disruption of reward processing in addiction [3] and implicated brain reward circuitry in different

Materials and Methods
The study protocol is illustrated in Figure 1. The sample consisted of 30 abstinent individuals with a past diagnosis of AUD and 30 healthy volunteers (CTL) (see Section 2.1 for details). The analytic measures included rs-fMRI (Sections 2.4-2.6), self-rated impulsivity scores (Section 2.3), and neuropsychological test scores (Section 2.2). The major analyses employed in the study were (i) feature selection to extract relevant features for use in the classification analysis (Section 2.7); (ii) the random forest method to identify the key features that significantly contribute to classifying AUD from CTL participants (Section 2.8); and (iii) zero-order correlations to compute associations of the key variables identified by the random forest classification model (a) among themselves (Section 3.2) and (b) with age in each group (Section 3.3). Partial correlations were employed to identify associations between age and the key variables of classification by controlling for group effects in the total sample (Section 3.3).
Figure 1. The study protocol listing the sample, measures, and analytic techniques. The sample consisted of two groups of 30 individuals each, viz., AUD and CTL. The measures used in the prediction model included rs-fMRI functional connectivity (reward network), impulsivity assessed with the Barratt Impulsiveness Scale (BIS), and neuropsychological performance scores. Major analyses were feature selection for selecting FC variables, the random forest classification method, and correlational analyses, including zero-order and partial correlations.

Sample
The sample characteristics are presented in Table 1, and a detailed description is also available in Pandey et al. [54]. All participants in the current study were drawn from the sample of a larger study on brain dysfunction in chronic alcoholism conducted at the SUNY Downstate Health Sciences University, Brooklyn, NY, USA. Thirty currently abstinent adult males with past AUD (mean age (SD) = 41.42 (7.31) years) and thirty unaffected male controls (mean age (SD) = 27.44 (4.74) years), who had undergone multimodal assessments, including structural and functional MRI and neuropsychological tests, were selected for the present study. The "race/ethnic" distribution of the sample was: African Ancestry = 25; European Ancestry = 9; Asian = 21; American Indian = 1; More than one race = 2; and Unknown = 2. Participants with AUD were recruited from alcohol treatment centers in and around New York City after they had been detoxified and abstinent for at least 30 days prior to testing. As shown in Table 1, some of the participants from the AUD group had consumed tobacco (N = 20) and/or marijuana (N = 10) during the last 6 months (but not within the 5 days before testing). None of the participants were in withdrawal for alcohol or any other drugs (including nicotine) at the time of testing. Individuals for the control group (CTL) were recruited through advertisements and screened to exclude any personal or family history of major medical, psychiatric, or substance-related disorders. The CTL participants did not have any past or present history of substance dependence or abuse (DSM-IV), although some of them (N = 12) were light/regular drinkers and had used alcohol in the last 6 months (N = 18) (see Table 1 for details). All participants were asked to abstain from alcohol and other drugs for 5 days prior to MRI scans.
Behav. Sci. 2022, 12, 128
Clinical information regarding substance use, psychiatric disorders, and family history was assessed using a modified version of the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) [55]. The majority of subjects were right-handed, with only a few who were either left-handed (5 in the AUD and 2 in the CTL group) or ambidextrous (2 in the AUD and 1 in the CTL group). Clinical and psychometric data were collected at the SUNY Downstate Health Sciences University, while the fMRI data were acquired at the Nathan Kline Institute (NKI) for Psychiatric Research. Standard MRI safety protocols and exclusion criteria (implants, tattoos, cosmetics, claustrophobia, etc.) were followed to ensure subjects' safety and data quality. Individuals with hearing/visual impairment, a history of head injury, or moderate to severe cognitive deficits (score < 21) on the Mini-Mental State Examination (MMSE) [56] were also excluded from the study. All participants provided informed consent, and the Institutional Review Boards of both centers approved the research protocols (IRB approval ID: SUNY-266893; NKI-212263).

Neuropsychological Assessment
Participants were administered two computerized tests from the Colorado assessment tests for cognitive and neuropsychological assessment [57], namely, the Tower of London Test (TOLT) [58], and the visual span test (VST) [59,60].

Tower of London Test (TOLT)
The TOLT assesses planning and problem-solving ability of the executive functions. In this test, participants were asked to solve a set of puzzles with graded difficulty levels by arranging a number of colored beads one at a time from a starting position to the desired goal position in as few moves as possible. The test consisted of 3 puzzle types, with 3, 4, and 5 colored beads placed on the same number of pegs, with 7 problems/trials per type and a total of 21 trials. The following performance measures from the sum total of all puzzle types were used in the analysis: (i) excess moves, which is the additional moves beyond the minimum moves required to solve the puzzle ("ExcMovMade_All"); (ii) average pickup time, which is the initial thinking/planning time spent until picking up the first bead to solve the puzzle ("AvgPicTime_All"); (iii) average total time, which is the total thinking/planning time to solve the problem in each puzzle type ("AvgTotTime_All"); (iv) total trial time, which is the total performance/execution time spent on all trials within each puzzle type ("TotTrlTime_All"); and (v) average trial time, which is the mean performance/execution time across trials per puzzle type ("AvgTrlTime_All").

Visual Span Test (VST)
The VST measured visuospatial memory span from the forward condition and working memory from the backward condition. In this test, a set of randomly arranged squares, ranging from 2 to 8 squares per trial, flashed in a predetermined sequence depending on the span level being assessed. Each span level was administered twice, with a total of 14 trials in each condition. During the forward condition, subjects were required to repeat the sequence in the same order by clicking on the squares using a computer mouse. In the backward condition, subjects were required to repeat the sequence in the reverse order (starting from the last square). The following performance measures were collected during forward and backward conditions: (i) total number of correctly performed trials ("TotCor_Fw" and "TotCor_Bw"); (ii) maximum span or sequence-length achieved ("Span_Fw" and "Span_Bw"); (iii) total average time, which is the sum of mean time-taken across all trials performed ("TotAvgTime_Fw" and "TotAvgTime_Bw"); and (iv) total correct average time, which is the sum of mean time-taken across all trials correctly performed ("TotCorAvgTime_Fw" and "TotCorAvgTime_Bw").

MRI Data Acquisition
The details of fMRI data acquisition on the same sample have been described previously by our group [63]. Briefly, MRI scanning was performed at the Nathan Kline Institute using a 3.0 Tesla Siemens Tim Trio scanner (Erlangen, Germany). The resting-state fMRI data, used in the current study, were collected during eyes closed alert state for 6.11 s. A series of T2*-weighted gradient echo single-shot echo-planar imaging (EPI) volumes with the following sequence parameters was acquired: TR = 2750 ms; TE = 30 ms; flip angle = 80°; voxel size = 2.5 × 2.5 × 3.5 mm³; matrix size = 96 × 96; number of slices = 50; number of volumes = 130; FOV = 240 mm; and GRAPPA acceleration factor = 3. The sequence was carefully optimized to minimize the effects of magnetic susceptibility inhomogeneities (such as distortions and signal dropouts), as well as the effects of mechanical vibrations, which elevate Nyquist ghosting levels. In addition, a magnetization-prepared rapid gradient-echo (MPRAGE) high-resolution three-dimensional T1-weighted structural image served as an anatomical reference for the fMRI data, as well as for the non-linear registration of imaging data between subjects. The sequence parameters for the MPRAGE were: TR = 2500 ms; TE = 3.5 ms; TI = 1200 ms; flip angle = 8°; voxel size = 1 × 1 × 1 mm³; matrix size = 256 × 256 × 192; FOV = 256 mm; and number of averages = 1.

Image Processing
The details of image processing for the resting state fMRI are available in our previous study [63]. Briefly, the intra-subject inter-modality linear registration module [64] of the automatic registration toolbox (ART; www.nitrc.org/projects/art (accessed on 20 December 2019)) was used to register the structural (MPRAGE) and functional MRI volumes. The brainwash program within the ART toolbox was used for skull-stripping the MPRAGE volumes. Motion detection and correction were performed using the 3dvolreg module of the AFNI software package [65]. Furthermore, the non-linear registration module of the ART [66] was used to correct for the geometric distortions of the fMRI images due to magnetic susceptibility differences in the head, particularly at brain/air interfaces. The skull-stripped MPRAGE images from all subjects were non-linearly registered to a study-specific population template using ART's non-linear registration algorithm, which is one of the most accurate inter-subject registration methods available [67]. The population template was formed using an iterative method [68], and the motion-corrected fMRI time-series were detrended using PCA [69]. Finally, fMRI images from all subjects were normalized to a standard space.

Reward Network Seed Regions and rsFC Calculations
The regions of interest (ROIs) for the reward network were identified from published reviews and meta-analyses of reward processing, e.g., [6,70,71]. These included 34 ROIs from 17 bilateral structures, comprising nine subcortical structures and eight cortical regions (Table 2 and Figure 2). The ROIs were marked using ITK-SNAP, an image processing application [72]. The diameters of subcortical and cortical ROIs were 7 mm and 11 mm, respectively, defined from the MNI coordinates (Table 2). The ROI-to-ROI connectivity [73], the most commonly used method to derive rsFC across brain regions [74], was computed as the Pearson correlation coefficient between the BOLD time series of every unique pair (N = 561) of the 34 ROIs listed in Table 2. The resulting correlation coefficients were Fisher Z-transformed for further statistical analyses. All 561 connections derived from unique combinations of the ROIs were used in the feature selection process (see Section 2.7), which provided the subset of features to be used in the random forest model (see Section 2.8).
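The ROI-to-ROI computation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the time series are random toy data, the helper names (`pearson_r`, `fisher_z`, `roi_to_roi_fc`) are hypothetical, and only the variable naming pattern `FC_i_j` follows the labels used in this paper.

```python
import math
import random
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher Z-transform of a correlation coefficient (requires |r| < 1)."""
    return math.atanh(r)


def roi_to_roi_fc(timeseries):
    """Map {roi_number: BOLD samples} to {'FC_i_j': z} for every unique ROI pair."""
    fc = {}
    for i, j in combinations(sorted(timeseries), 2):
        fc[f"FC_{i}_{j}"] = fisher_z(pearson_r(timeseries[i], timeseries[j]))
    return fc


# Toy data only: 34 ROIs with 130 volumes each, drawn from a normal distribution.
random.seed(0)
N_ROIS, N_VOLUMES = 34, 130
toy = {roi: [random.gauss(0, 1) for _ in range(N_VOLUMES)]
       for roi in range(1, N_ROIS + 1)}

fc = roi_to_roi_fc(toy)
print(len(fc))  # 34 * 33 / 2 = 561 unique connections
```

With 34 ROIs, the number of unique pairs is 34 × 33 / 2 = 561, which matches the number of connections entered into feature selection.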

Feature Selection of FC Variables
Recent approaches to machine learning analyses have used a two-stage approach, consisting of feature selection followed by a predictive algorithm using a selected set of variables [75][76][77][78][79]. Feature selection methods are used as the first stage to reduce irrelevant and redundant variables, which may otherwise add noise to the predictive models [75][76][77]. The advantages of feature selection include a better understanding of the data, reduced computation requirements, mitigation of the curse of dimensionality, and improved predictor performance [76]. We applied binomial lasso regression [80][81][82] as a feature selection method [83,84], as implemented in the R package "glmnet", to extract from the fMRI FC variables (N = 561) a subset that held significant predictive value to discriminate the AUD from the CTL group. Feature selection was implemented only for the rsFC variables because of the high number of connections, most of which were deemed not relevant for our purpose of AUD classification. The method adopted in the current analysis is based on the lasso method, as implemented in Fonti and Belitser [84]. The maximum number of output features "pmax" was set to 10% (i.e., 56 of the total 561 variables). A 10-fold cross-validation and lambda thresholding with 1 SE (λ1se) were set to extract the final set of key variables. The area under the curve (AUC) was plotted to determine the classification performance of the selected features. The final subset resulting from the feature selection process included 21 rsFC variables (see Table A1, Appendix A).
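To make the 1 SE thresholding concrete, the sketch below shows how a "one standard error" penalty could be picked from cross-validation results. It assumes an error measure that is minimized (such as deviance); the penalty values, errors, and standard errors are invented for illustration and are not results from this study, and the function name `lambda_1se` is hypothetical.

```python
def lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries.

    lambda_min gives the lowest mean CV error; lambda_1se is the largest
    penalty whose mean CV error is within one standard error of that minimum,
    which yields a sparser model.
    """
    best = min(range(len(lambdas)), key=lambda k: cv_mean[k])
    threshold = cv_mean[best] + cv_se[best]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[best], max(eligible)


# Illustrative numbers only (not from the study).
lambdas = [0.20, 0.10, 0.05, 0.02, 0.01]
cv_mean = [1.30, 1.05, 0.98, 0.95, 0.97]
cv_se = [0.06, 0.05, 0.05, 0.05, 0.06]

lam_min, lam_1se = lambda_1se(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)  # 0.02 0.05

# Cap on the number of selected features: 10% of the 561 connections.
pmax = 561 // 10
print(pmax)  # 56
```

Choosing the larger penalty trades a small loss in cross-validated fit for fewer retained connections, which is the purpose of a feature selection stage.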

Random Forest Classification
The random forest classification model, as used in the current study, has been described in our previous work on rsFC of the default mode network (DMN) [63]. The predictor variables included in the model were 21 reward network connections identified by feature selection (Table A1), 13 neuropsychological scores consisting of 5 TOLT scores and 8 VST scores (see Section 2.2), and 3 BIS scores (see Section 2.3), while the group status (AUD and CTL) served as the outcome variable (see Table A1, Appendix A). Although age was significantly different between the groups, we did not include age as an input variable in the classification model for the following reasons: (i) as done in our previous publications on the same sample of subjects [63,85], we performed post hoc correlational analysis of age with the significant features of the random forest model, to see if any of the top variables had associations with age in the individual groups or in total sample (see Section 3.4); and (ii) since the age difference between the groups was highly significant, including age as a feature in the classification model would likely artificially increase the accuracy of classification, which would not be desirable. To compute the classification accuracy of the random forest model, we used an out-of-bag (OOB) error estimate. According to Breiman and Cutler [86], in random forests, owing to the inbuilt OOB feature in the model, additional cross-validation is not a requirement to obtain an unbiased estimate of the test sample error, since it is estimated internally in the OOB algorithm. The random forest algorithm constructs each of the decision trees using separate bootstrap subsamples from the training data, and about one-third of the observations from the training data are left out during each bootstrap, called the OOB sample, which are used as a form of test data only, to estimate the prediction accuracy of the RF model. 
While classification trees are grown for each bootstrap sample (which is approximately two-thirds of the training data), the OOB error rate is calculated for each classification tree built. According to Breiman [87], there are two reasons for using bagging: (i) to enhance the accuracy when random features are used; and (ii) to give ongoing estimates of the generalization error of the combined ensemble of trees, as well as estimates for the strength and correlation. The aggregate of OOB scores on all "ntree" trees (which is the maximum number of trees pre-set in the model calculation) provides the ensemble OOB error rate. Thus, the OOB score provides validation for the RF model. Therefore, unlike other machine learning algorithms, the random forest method does not require separate training and test data when specifying the model. In the current study, the maximum number of trees "ntree" was set at 1000. The optimal number of features analyzed at each node ("mtry") in the model was estimated to be 8 (using the "tuneRF" function). The final list of variables that significantly contributed to the classification was tabulated and sorted based on their importance to classification. For the top significant FC variables, the brain connectivity across ROIs was mapped onto a 3-dimensional ICBM atlas [88] using custom Matlab scripts.
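The claim that roughly one-third of the observations are left out of each bootstrap sample can be checked with a short simulation. This is a sketch of the resampling step only, not a random forest implementation; the helper `oob_indices` is hypothetical, and the sample size (60) and number of trees (1000) mirror the values reported above.

```python
import random


def oob_indices(n, rng):
    """Draw one bootstrap sample of size n and return the out-of-bag indices."""
    in_bag = {rng.randrange(n) for _ in range(n)}
    return [i for i in range(n) if i not in in_bag]


rng = random.Random(42)
n_subjects, n_trees = 60, 1000

# Fraction of subjects left out of the bootstrap sample for each tree.
fractions = [len(oob_indices(n_subjects, rng)) / n_subjects
             for _ in range(n_trees)]
mean_oob = sum(fractions) / n_trees

# Probability that a given subject is never drawn in n draws with replacement.
expected = (1 - 1 / n_subjects) ** n_subjects

print(f"simulated OOB fraction: {mean_oob:.3f}")
print(f"theoretical OOB fraction: {expected:.3f}")
```

For n = 60 the theoretical fraction is (59/60)^60, about 0.36, so each tree is evaluated on roughly a third of the participants that it never saw during training.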

Classification Accuracy and Top Significant Variables
The RF algorithm correctly identified group membership of 26 out of 30 individuals in each group, in classifying them into either AUD or CTL group, with an accuracy rate of 86.67% and the area under the curve of 93% (Figure 3). The OOB error or the misclassification rate was 13.33% (for each group). The model also identified 12 rsFC connections and two impulsivity scores (motor and non-planning) as significantly (p < 0.05) contributing to the classification (Table 3). Relative to the CTL individuals, AUD subjects showed a predominant pattern of hypoconnectivity (i.e., decreased rsFC in 9 out of 12 connections) across the major cortical and subcortical nodes of the reward network, in addition to three connections with hyperconnectivity in specific nodes (i.e., left nucleus accumbens-left posterior cingulate cortex (PCC), right pallidum-right PCC, and right hippocampus-left dorsolateral prefrontal cortex). AUD individuals also showed increased impulsivity in motor and non-planning categories. However, none of the neuropsychological variables were significant based on the p-value criterion.
Table 3. Random forest importance parameters (mean minimal depth, number of nodes, accuracy decrease, Gini decrease, number of trees, times a root, and p-value) and the direction of significance for the top significant variables (p < 0.05) are shown. Two of the impulsivity scores (motor and non-planning) and 12 rsFC connections were identified as important features to classify individuals into either AUD or CTL group. The variables are sorted based on the p-values.
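The reported accuracy and error rates follow directly from the counts of correctly classified individuals. The sketch below reproduces that arithmetic; the function name is hypothetical, and labeling the per-group rates as sensitivity and specificity assumes that AUD is treated as the positive class.

```python
def classification_summary(correct_aud, n_aud, correct_ctl, n_ctl):
    """Summarize a two-group classification from per-group correct counts."""
    total = n_aud + n_ctl
    accuracy = (correct_aud + correct_ctl) / total
    return {
        "accuracy": accuracy,
        "oob_error": 1 - accuracy,
        "sensitivity": correct_aud / n_aud,  # assumes AUD is the positive class
        "specificity": correct_ctl / n_ctl,
    }


# 26 of 30 individuals correctly classified in each group, as reported above.
summary = classification_summary(26, 30, 26, 30)
for name, value in summary.items():
    print(f"{name}: {100 * value:.2f}%")
```

Because 26 of 30 were correct in both groups, the overall accuracy (52/60 = 86.67%) equals the per-group rate, and the misclassification rate is 8/60 = 13.33%.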

Distribution of Minimal Depth
The distribution of minimal depth among the decision trees of the forests for the top significant variables is shown in Figure 4. Minimal depth of a variable represents the depth of the node that splits on that variable and is the closest to the root of the decision tree. A lower mean minimal depth of a variable indicates that a higher number of observations (participants) were categorized into a specific group based on that variable. The ranking based on the minimal depth parameter shows that two of the impulsivity variables are at the top of the importance list, followed by several reward network connections and a neuropsychological feature (total correct score in the forward trials of the visual span test).
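The minimal depth measure can be illustrated on a toy forest. In the sketch below, each tree is a nested dictionary with hypothetical keys (`var`, `left`, `right`), leaves are omitted, the root has depth 0, and the variable labels are placeholders in the style of this paper. Averaging only over trees in which a variable appears is an assumption made for simplicity; other conventions for handling absent variables exist.

```python
def minimal_depths(tree, depth=0, found=None):
    """Return {variable: depth of its split closest to the root} for one tree."""
    if found is None:
        found = {}
    if tree is None:  # reached a leaf
        return found
    var = tree["var"]
    if var not in found or depth < found[var]:
        found[var] = depth
    minimal_depths(tree.get("left"), depth + 1, found)
    minimal_depths(tree.get("right"), depth + 1, found)
    return found


def mean_minimal_depth(forest):
    """Average minimal depth per variable over the trees that split on it."""
    per_var = {}
    for tree in forest:
        for var, depth in minimal_depths(tree).items():
            per_var.setdefault(var, []).append(depth)
    return {var: sum(d) / len(d) for var, d in per_var.items()}


# Toy forest of two hand-built trees (not trees from the study).
tree_1 = {"var": "BIS_NP",
          "left": {"var": "FC_6_7"},
          "right": {"var": "BIS_MI", "left": {"var": "FC_6_7"}}}
tree_2 = {"var": "BIS_MI",
          "left": {"var": "BIS_NP"},
          "right": {"var": "TotCor_Fw", "left": {"var": "FC_6_7"}}}

mmd = mean_minimal_depth([tree_1, tree_2])
for var in sorted(mmd, key=mmd.get):
    print(var, mmd[var])
```

In this toy forest the two impulsivity placeholders split at or near the root and therefore receive the lowest mean minimal depth, which is the pattern the ranking in Figure 4 describes.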

Multi-Way Importance
The top significant variables can also be shown by a multi-way importance plot based on any of the three RF importance measures. Figure 5 illustrates the important features that contributed to group classification based on Gini decrease, number of trees, and p-value. These features include: (i) 12 FC variables (representing connections across cortical and subcortical regions) (see Table 2 for details), (ii) one neuropsychological variable (i.e., total number of correct trials in forward span), and (iii) two impulsivity scores (motor and non-planning categories).
Figure 4. The distribution of minimal depth among the trees of the forest for the significant variables is shown in different colors for each level of minimal depth. The mean minimal depth in the distribution for each variable is marked by a vertical black bar overlapped by a value label inside a box. Based on the mean minimal depth values, the importance list comprised 2 BIS scores, 13 FC, and 1 neuropsychological score, which contributed to the RF classification of AUD and CTL individuals. The lower mean minimal depth of a feature represents a higher number of observations (participants) categorized in a specific group based on the feature. The number of trees for a feature represents the total number of decision trees in which a split occurs on the feature (see Table 2 for details about the ROI numbers represented in the FC variables).
See Table 2 for details about the ROI numbers (1-34) that are represented in the rsFC variable names.

Correlations among Rankings of RF Parameters
The correlations among the rankings of features based on different RF parameters (Figure 6) were very high and significant (r > 0.9), suggesting that each of the RF parameters would rank the features in a very similar order while classifying the individuals into either the AUD or CTL group. High correlations among these parameters also suggest that each parameter is very valuable and reliable in terms of its classification performance, lending further support to the utility of the RF technique as a powerful tool for classifying individuals using a set of multi-domain features.
Figure 6. Illustration of rankings of features based on correlation between any two random forest (RF) parameters. The panels in the lower triangle of the grid show the distribution of rankings of all predictor variables with black dots along a blue trend line. The panels in the upper triangle of the grid show correlation coefficients across rankings of any two parameters. All RF parameters of importance were found to have very high correlations with each other, suggesting the high reliability of each of these parameters in ranking the importance of features for classification. The asterisks (***) indicate that the correlations were highly significant (p < 0.001).

Connectivity Mapping of Significant rsFC Connections
Significant reward network connections are illustrated in Figure 7. Among the 12 significant connections, nine were hypoconnected and three were hyperconnected in AUD individuals, involving 17 regions (7 from the left and 10 from the right hemisphere) of the 34 reward network ROIs. While the majority of these nodes (12 of 17) were of solo paths, connecting to another single node (ROI # 2, 3,

Correlations among the Top Significant Variables
Exploratory (descriptive) analysis using zero-order correlations among the top significant variables is shown in Figure 8.
Figure 8. Values within each cell represent a bivariate Pearson correlation between the variable on its vertical axis and the variable on its horizontal axis. Correlation coefficients are color-coded (red/pink shades represent negative r-values, blue/cyan shades indicate positive r-values, darker colors represent higher magnitude) and significant correlations (before Bonferroni correction) have been marked with asterisks in black font (* p < 0.05; ** p < 0.01; *** p < 0.001). The significant correlations that survived Bonferroni correction have been marked with +++ sign in white font (+++ Significant after Bonferroni correction). Refer to Table 2 for details about the ROI numbers (1-34) that are represented in the rsFC variable names. Acronyms: FC-Functional connectivity, TotCor_Fw-total number of correctly performed forward trials, BIS-Barratt Impulsiveness Scale, NP-Non-planning, MI-Motor Impulsivity.

Correlations between Significant Variables and Age
Since the age difference across the groups was statistically significant (p < 0.001), the association of age with significant predictor variables from the RF analysis was calculated within each group using bivariate Pearson correlation and in the total sample using partial correlation adjusted for the group effect (Table 4) as an exploratory (descriptive) analysis. It was found that there was no significant association of age with the top significant variables in any of the groups or in the total sample, after correcting for multiple comparisons. However, a single FC variable (FC_6_7) representing the connectivity between the right amygdala and left hippocampus was significant (r = 0.38; p = 0.0374), but it did not survive multiple testing correction. Table 4. Pearson bivariate correlations between the age of the participant and the top significant variables of the RF model. Correlation coefficient (r) and p-values (before Bonferroni correction) are provided for alcohol use disorder (AUD), control (CTL) group, and the total sample (ALL). None of the variables survived Bonferroni correction for multiple comparisons. Zero-order correlations were used for each group separately (N = 30) and partial correlations controlling for group effects were used for the all sample (N = 60).  Table 2 for the details of the ROIs in the FC variable.
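The age analysis described above (zero-order Pearson correlations within each group, a partial correlation adjusted for group in the total sample, and Bonferroni correction) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the data are synthetic, the function names `partial_corr_group` and `survives_bonferroni` are hypothetical, and the number of tests (`n_tests=15`) is an assumed value, since the text does not state how many comparisons were corrected for.

```python
import numpy as np
from scipy import stats


def partial_corr_group(x, y, group):
    """Partial Pearson correlation of x and y, controlling for a binary group label."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    group = np.asarray(group, dtype=float)
    # Design matrix: intercept plus the group indicator (the single covariate).
    design = np.column_stack([np.ones_like(group), group])
    # Residualize both variables on the covariate via least squares.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(np.corrcoef(res_x, res_y)[0, 1])
    # One covariate costs one extra degree of freedom: df = n - 3.
    df = x.size - 3
    t_stat = r * np.sqrt(df / (1.0 - r ** 2))
    p = float(2.0 * stats.t.sf(abs(t_stat), df))
    return r, p


def survives_bonferroni(p, n_tests, alpha=0.05):
    """True if p is below the Bonferroni-adjusted threshold alpha / n_tests."""
    return p < alpha / n_tests


# Synthetic data only (NOT the study's data): 30 AUD and 30 CTL participants.
rng = np.random.default_rng(0)
group = np.repeat([1, 0], 30)                  # 1 = AUD, 0 = CTL
age = rng.normal(40, 8, size=60) + 6 * group   # built-in group difference in age
fc = rng.normal(0, 1, size=60) - 0.5 * group   # hypothetical FC feature

r_all, p_all = partial_corr_group(age, fc, group)
r_aud, p_aud = stats.pearsonr(age[group == 1], fc[group == 1])
print(f"ALL (partial): r={r_all:.2f}, p={p_all:.3f}")
print(f"AUD (zero-order): r={r_aud:.2f}, p={p_aud:.3f}")

# n_tests=15 is an assumption; the text does not report the number of tests.
print(survives_bonferroni(0.0374, n_tests=15))
```

Because the Bonferroni threshold is alpha divided by the number of tests, an uncorrected p of 0.0374 already exceeds the threshold for two tests (0.05 / 2 = 0.025), so it cannot survive correction for any larger family of comparisons.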

Neuropsychological Scores between the Groups
Since the rankings of neuropsychological variables varied widely across different parameters of the random forest classification, these features were statistically compared across the participant groups using one-way analysis of variance (ANOVA) to determine the level of significance (see Table 5). Only two variables from the visual span test (i.e., TotCor_Fw and Span_Fw) were significant after Bonferroni correction. The score "TotCor_Fw" (total number of correctly performed forward trials), which showed the highest significance level, was also identified by some of the parameters of the random forest as a variable contributing to group classification.
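The group comparison described above (a one-way ANOVA per neuropsychological variable, followed by Bonferroni correction) might look like the sketch below. It is illustrative only: the scores are synthetic, the helper name `anova_with_bonferroni` is hypothetical, and correcting across only the variables passed in is an assumption, since the full list of variables in Table 5 is not reproduced here.

```python
import numpy as np
from scipy import stats


def anova_with_bonferroni(aud_scores, ctl_scores, alpha=0.05):
    """One-way ANOVA per variable between two groups, flagged against alpha / n_variables."""
    n_tests = len(aud_scores)
    results = {}
    for name in aud_scores:
        f_stat, p = stats.f_oneway(aud_scores[name], ctl_scores[name])
        results[name] = {
            "F": float(f_stat),
            "p": float(p),
            "significant": bool(p < alpha / n_tests),
        }
    return results


# Synthetic scores only (NOT the study's data). The variable names come from the
# text; the means and spreads are invented for illustration.
rng = np.random.default_rng(1)
aud = {
    "TotCor_Fw": rng.normal(6.5, 1.5, 30),
    "Span_Fw": rng.normal(5.0, 1.0, 30),
}
ctl = {
    "TotCor_Fw": rng.normal(8.5, 1.5, 30),
    "Span_Fw": rng.normal(6.0, 1.0, 30),
}

for name, res in anova_with_bonferroni(aud, ctl).items():
    print(f"{name}: F={res['F']:.2f}, p={res['p']:.4f}, significant={res['significant']}")
```

With only two groups, a one-way ANOVA is equivalent to an equal-variance two-sample t-test (F equals t squared, with identical p-values), so either test would lead to the same conclusion for each variable.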

Discussion
The goal of the present study was to identify specific features from a set of multidomain measures, including functional connectivity in the reward network, neuropsychological performance, and impulsivity, to classify individuals with AUD from healthy controls. The results showed that the random forest algorithm was highly successful in identifying the key features that contributed significantly to differentiating AUD from CTL individuals. Relative to controls, AUD individuals manifested (i) alterations in functional connectivity across reward network regions (including ventral tegmental area, nucleus accumbens, anterior insula, anterior cingulate cortex, and other cortical and subcortical structures), showing hypoconnectivity in nine connections and hyperconnectivity in three connections, (ii) increased impulsivity in motor and non-planning categories, and (iii) poorer neuropsychological performance, in terms of total number of correct trials in the forward sequence of the visual-spatial memory span test. In summary, relative to healthy controls, AUD individuals manifested aberrant functional connectivity in the reward network, increased impulsivity, and poor neuropsychological performance in visual-spatial memory.

Altered Functional Connectivity across Reward Network in AUD Individuals
Addiction to drugs and alcohol involves a cascade of neuroadaptive processes, causing changes in the brain circuitries at different stages of addiction [18,89,90]. Our findings on resting state FC in the reward network indicate that AUD subjects manifested alterations in connectivity patterns, in terms of hypoconnectivity in nine connections and hyperconnectivity in three connections, involving 17 key reward structures (see Figure 7). In particular, of the nine reward network functional connections that showed hypoconnectivity, three were cortico-cortical connections in the right hemisphere (R.Ins-R.ACC, R.ACC-R.OFC, and R.ACC-R.PCC), five were subcortical-subcortical connections involving both hemispheres (R.Amg-L.Hip, L.Cdt-R.Pal, L.Tha-R.Tha, L.Cdt-L.Tha, and L.Tha-R.Ptm), and one was an inter-hemispheric subcortical-cortical connection (R.VTA-L.ACC). The three connections that were hyperconnected in AUD individuals were subcortical-cortical, comprising a left-hemispheric connection (L.NAc-L.PCC), a right-hemispheric connection (R.Pal-R.PCC), and an inter-hemispheric connection (R.Hip-L.DLP). These findings of altered brain connectivity in AUD individuals may be suggestive of neuroadaptation in the hub regions of the reward network, caused by chronic alcohol intake. In general, previous fMRI studies have reported such aberrations in resting state connectivity underlying multiple brain networks in individuals with SUD [19,28,91], including those with AUD [30][31][32][33][63][92][93][94]. Our previous study on the same sample of participants reported that AUD individuals manifested altered default mode network (DMN) connectivity compared to controls [63].
It is clear from the findings of our past and current rsFC studies that AUD individuals manifest brain connectivity changes across neural structures involved in spontaneous, self-referential thoughts, as elicited by the DMN [95], as well as the commonly reported reward processing deficits, as elicited by the RN [10]. It is also remarkable to note that both studies showed hypoconnectivity between the ACC and PCC nodes in AUD subjects, confirming the notion of abnormal self-referential processing in these individuals [92]. Furthermore, hypoconnectivity across reward structures during resting state, as found in the current study, may also indicate a vulnerability to relapse in individuals with a history of SUD [96]. Taken together, these findings lend support to the findings of the current study, that individuals with a long history of alcohol use continue to manifest network abnormalities across the reward regions, in addition to the previously reported aberrations in DMN, even after prolonged abstinence from alcohol consumption.
The predominant pattern observed in AUD individuals was hypoconnectivity (9 out of 12 connections) across subcortical reward regions (5 connections) followed by cortical (3 connections) and cortical-subcortical (1 connection) subnetworks. Specifically, these hypoconnected nodes include key reward regions such as the VTA, amygdala, hippocampus, thalamus, pallidum, putamen, insula, ACC, PCC, and OFC, possibly indicating lower or less efficient neural communication across these subnetworks [6]. Since addiction has been characterized as a reward deficiency syndrome [97], hypoconnectivity across these reward structures in abstinent AUD subjects may indicate reduced responsiveness to rewarding stimuli during resting state, possibly due to decreased dopamine function in these individuals [50,98]. Although elevated levels of dopamine in the dorsal striatum are associated with motivation to seek and consume alcohol and drugs, long-term substance use is associated with decreased dopaminergic function, as evidenced by reductions in D2 dopamine receptors and dopamine release in the striatum in addicted subjects [50], which can also lead to reduced activity in other cortical reward regions such as the orbitofrontal cortex and cingulate gyrus, resulting in loss of control and compulsive substance use. It may be worth noting that the majority of the hypoconnected regions in AUD (9 out of 13 regions) involved the right hemisphere, implicating hemispheric asymmetry in connectivity in AUD and, hence, their functional attributes [99]. For instance, laterality studies on motivation and emotions reported that the right hemisphere responds more to unpredicted, urgent, and novel environmental events, while the left hemisphere engages with routine and habitual behaviors [100].
Therefore, it is possible that the alterations in resting state brain connectivity across reward network seen in AUD individuals have more impact on right hemispheric function, including novelty seeking and impulsivity [101].
The other FC finding was that AUD individuals manifested hyperconnectivity in three connections across five brain regions, i.e., nucleus accumbens, pallidum, hippocampus, posterior cingulate, and dorsolateral prefrontal cortex, possibly suggesting excessive and/or less focused communication during resting state among these structures. While each of these key regions is associated with distinct and shared neurocognitive processes, hyperconnectivity across these nodes during resting state can be generally interpreted as excessive rumination about reward and preoccupation with reward-related imagery or inherent reward-seeking tendencies, such as craving. Similar to our finding, a higher FC between nucleus accumbens and posterior cingulate gyrus was observed in relapsers compared to abstainers of stimulant use [96]. In addition, akin to our finding, higher hippocampal-prefrontal connectivity has also been observed in internet gaming addiction [102]. Furthermore, recent evidence implicates the pallidum as an important structure in mesocorticolimbic reward processing [103]; therefore, the increased connectivity between pallidum and PCC may indicate reward-related tendencies (e.g., drug seeking) in the resting state. However, despite the growing number of fMRI brain connectivity studies on AUD and other SUDs during resting state and task conditions [19,28,63,94,102,104], more studies are needed to understand the exact role of specific connectivity patterns across the reward network.

Heightened Impulsivity in AUD Individuals
The findings of the current study also showed that motor and non-planning impulsivity components were the topmost features contributing to the classification of AUD individuals from controls. This finding reinforces the long-held notion that AUD and other externalizing traits are part of the externalizing spectrum disorders [105][106][107][108][109]. It is also known that AUD is associated with making impulsive choices during decision-making [16]. It is possible that the increased impulsivity manifested by AUD individuals may be due to altered FC across the frontal nodes [63] and/or relatively lower brain volumes of the frontal regions, as reported in our previous study [54]. Evidence from the imaging literature strongly suggests that both structural and functional aspects of the frontal lobes contribute to increased impulsivity in AUD patients [110][111][112][113]. Furthermore, recent studies have also reported associations between impulsivity and resting state measures of EEG power [114], EEG-based FC [115], and fMRI-based FC [36], suggesting that specific brain networks may mediate aspects of impulsivity in AUD, as well as other externalizing disorders. Therefore, identifying and quantifying behavioral impulsivity may contribute to improving prevention and intervention programs related to alcohol and other substance use problems [116,117]. Although attentional impulsivity was not found to significantly contribute to the classification in the current study, our previous studies [63,85] found contributions from all three components of impulsivity, with the motor and non-planning aspects among the top features of classification, suggesting their relative importance in AUD pathology.

Poorer Memory Span in AUD Individuals
In the current study, the neuropsychological score "TotCor_Fw" (i.e., total number of correctly performed forward trials) from the visual span test was also identified as one of the key variables contributing to the classification of AUD individuals from controls by the RF model. In addition, parametric group comparison of neuropsychological variables (Table 5) revealed that the AUD group performed poorly in both "TotCor_Fw" and "Span_Fw" (i.e., span of recalled items during forward trials), compared to controls. Interestingly, these two variables tapping short-term memory performance were also found to be significant in our previous classification studies with the same sample of participants [63,85]. Furthermore, there is strong support in the literature for memory deficits in individuals with chronic AUD [118][119][120][121], and some of the deficits linger even after prolonged abstinence [37]. It is also worth noting that in our previous structural MRI study on the same groups of subjects, we found that the AUD group showed lower volumes in prefrontal cortex and left hippocampus, which were associated with poorer visuospatial memory performance [54]. Furthermore, in another study on the same sample, we found that AUD subjects manifested hyperconnectivity across the parahippocampal hub regions [63], adding support to the current finding related to memory deficits. On the other hand, it is surprising that none of the scores related to executive functioning in the TOLT were significantly different from controls, possibly suggesting a partial or complete recovery of these functions in the AUD group due to abstinence. It is also possible that the current study failed to capture deficits in additional domains, as the data were limited to only two tests and the sample size was modest.
Future studies may employ a comprehensive and sensitive battery of neuropsychological tests in a larger sample of abstinent AUD individuals, to map neuropsychological performance in multiple domains.

Correlations of Significant Variables among Themselves and with Age
Correlations among the significant variables that were identified by the RF classification model revealed three highly significant positive associations among the FC features (L.Cdt-R.Pal with L.Cdt-L.Tha; L.Tha-R.Tha with L.Tha-R.Ptm; and R.Ins-R.ACC with R.ACC-R.PCC), and each of these significant pairs had a common node, i.e., L.Cdt, L.Tha, and R.ACC, respectively. While it is expected that the pairs with a common node would correlate with each other, they are also known to be structurally connected. For instance, while the caudate nucleus connects the pallidum with radial fibers [122], these basal ganglia structures have reciprocal connections with the thalamus and cortical regions and, thus, mediate cognitive and motor functions [123]. Similarly, while the ACC has both structural and functional connectivity with the PCC as part of the DMN [124], reciprocal interaction between the ACC and anterior insula serves as a major connection with the salience and reward network [125]. Interestingly, the negative correlation between memory performance (working memory) and impulsivity observed in our study was also previously reported by other studies on individuals with substance use disorders [126,127]. Lastly, as expected, non-planning and motor impulsivity were positively correlated with each other [128], and both of these dimensions were shown to be associated with craving and relapse among alcohol-dependent males [129]. On the other hand, none of the correlations between the top predictor variables and age remained statistically significant after multiple testing correction, suggesting that age did not impact the classification of groups based on these top predictors. However, future studies should confirm these preliminary findings using a large sample of subjects involving both males and females matched on multiple characteristics, such as education, ethnicity, and premorbid IQ.

Limitations of the Current Study
The current study has several limitations. (i) The sample consisted of only males, and the findings may not be generalizable to females. In general, findings from prediction-based studies will be more useful when they are generalizable to different strata of the population from which the sample is drawn. Therefore, we suggest that future predictive studies aim for samples that include both genders. (ii) The outcome groups (AUD and CTL) were not matched for age, as the age difference was statistically significant. Although age was not significantly associated with the key features, either in each group or in the total sample, the results would have been more credible if the groups had been matched for age. Therefore, future studies should also aim to apply predictive models to age-matched groups, to avoid confounding the results. (iii) Finally, the sample size for a predictive model was rather small, although the random forest algorithm is known to handle such situations more effectively than other machine learning models. Therefore, similar studies with larger sample sizes are needed to confirm the findings of the present study. A larger sample is also warranted to explore associations among the features from multiple domains, since the current study did not identify potential associations among the features, possibly due to a lack of statistical power. Given these limitations, the obtained results should be regarded as preliminary; nevertheless, the findings of the present study might help to design future studies that avoid or mitigate these limitations.

Summary and Conclusions
The findings of the current study suggest that multidomain features drawn from the measures of brain connectivity, impulsivity, and neuropsychological tests can be successfully used in a machine learning framework to classify AUD individuals from healthy controls. In summary, our study revealed that the abstinent individuals with past AUD showed impaired brain connectivity across specific reward regions and also manifested relatively increased impulsivity and poor memory function. Evidence from the literature suggests that these anomalies may have been caused by neuroadaptation due to chronic drinking. Since treatment interventions intended to reverse these neuroadaptations show promise as potential therapeutic approaches for addiction [1], the findings of the current study may have important clinical implications. We suggest that future studies should further characterize these neural and behavioral abnormalities at multiple levels and groups, such as gender, race/ethnicity, educational attainment, socio-economic status, genomic liability, and drinking patterns, so that the findings may contribute towards personalized medicine.